CLK1

Dual Specificity Protein Kinase Cell Division Cycle2-Like 1 (PDB: 5J1V) from Homo sapiens

Created by: John Wilding

The protein known as dual specificity protein kinase cell division cycle2-like 1 (CLK1) (PDB ID: 5J1V) is a dual specificity kinase found in Homo sapiens (1). Primarily found in the nucleus, dual specificity kinases phosphorylate both serine/threonine- and tyrosine-containing protein substrates via adenosine triphosphate (ATP) hydrolysis, which produces adenosine diphosphate and a phosphoprotein (1). CLK1 phosphorylates serine (S)- and arginine (R)-rich proteins of the spliceosomal complex and is a crucial enzyme in a network of regulatory mechanisms that enable SR-rich proteins to control RNA splicing (1). In particular, CLK1 phosphorylates the serine residues in SR-rich proteins that determine the pre-mRNA splicing sites of microtubule-associated protein tau, which is implicated in Alzheimer’s disease and Parkinson’s disease (2). Drug inhibitors of CLK1 could potentially reduce further neurodegeneration in Alzheimer’s disease and Parkinson’s disease patients by preventing the tau microtubule tangles that cause neuron cell death (3). For CLK1, the pyrido[3,4-g]quinazoline derivative ZW290 (PQZ) successfully inhibited the ATP binding socket and allowed X-ray crystallography of all three CLK1 chains: A, B, and C (3). Besides PQZ, glycerol is the other ligand present in the crystal structure (1).

Several online tools were used to determine CLK1’s features and to find similar proteins based on CLK1’s primary and tertiary structures. The CLK1 protein has a molecular weight of 118513.27 Da and an isoelectric point of 6.75 (4). CLK1 was run through two programs, PSI-BLAST and the Dali server, to find comparative proteins for structural and functional analysis.
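A molecular weight like the one reported above can be approximated directly from a sequence by summing the average masses of its residues and adding one water molecule for the free termini. The Python sketch below illustrates only that arithmetic. The residue masses are standard average values, while the function name `molecular_weight` and the two-residue test peptide are hypothetical. It is not necessarily the procedure used by the tool cited in (4).

```python
# Average residue masses in daltons (each amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water accounts for the free N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Approximate average mass (Da) of an unmodified, linear peptide chain."""
    seq = sequence.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    if not seq:
        return 0.0
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# Hypothetical two-residue peptide (Gly-Ala), used only to show the call.
print(f"{molecular_weight('GA'):.2f} Da")
```

If the reported 118513.27 Da covers all three chains, each chain would weigh roughly a third of that, about 39504 Da. Whether the value is for one chain or for the whole trimer is an assumption that should be checked against (4).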

The PSI-BLAST program compares a protein’s primary structure to other known protein primary structures and creates a list of similar proteins based on their amino acid sequences. Each compared protein is assigned an E value relative to the protein of interest (5). Gaps in similarity between the protein of interest and each compared protein are scored and factor into that protein’s E value. CLK1 itself was assigned an E value of 0; proteins with an E value under 0.05 are considered very similar to the protein of interest. Serine/threonine-protein kinase YMR216C (PDB ID: 1HOW, S/TPK) has an E value of 5e-26 relative to CLK1, so S/TPK has a significantly similar protein sequence to CLK1 (5). S/TPK is used later in this paper as CLK1’s comparative protein.
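The selection rule described above amounts to keeping only the hits whose E value falls under 0.05. The Python sketch below shows this with the two E values reported in this paper. The third entry is an invented, hypothetical hit included only to show a rejection, and the function name `similar_hits` is likewise an assumption, not part of PSI-BLAST.

```python
E_VALUE_CUTOFF = 0.05  # similarity threshold stated in the text

# (name, E value) pairs; the first two values are taken from this paper.
hits = [
    ("CLK1 (5J1V, query)", 0.0),
    ("S/TPK (1HOW)", 5e-26),
    ("hypothetical unrelated hit", 3.2),  # invented for illustration only
]


def similar_hits(candidates, cutoff=E_VALUE_CUTOFF):
    """Return the hits with E value below the cutoff, most significant first."""
    kept = [hit for hit in candidates if hit[1] < cutoff]
    return sorted(kept, key=lambda hit: hit[1])


for name, e_value in similar_hits(hits):
    print(f"{name}: E = {e_value:g}")
```

A smaller E value means fewer matches of that quality are expected by chance, which is why the list is sorted in ascending order.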

The Dali server compares protein tertiary structures and calculates a Z-score based on differences in intramolecular distances (6). It uses the “sum-of-pairs” method to compare tertiary structures and assigns Z-scores based on the similarities found between the protein of interest and other proteins (6). The comparative protein S/TPK had a Z-score of 34.0 (6). A Z-score above two implies that the comparative protein has folds similar to those of the protein of interest and therefore a similar tertiary structure (6).
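The two criteria used to choose a comparative protein, an E value under 0.05 for sequence and a Z-score above two for structure, can be combined in one small check. The Python sketch below uses the S/TPK values reported in this paper. The `Comparison` class and its method names are hypothetical and are not part of PSI-BLAST or the Dali server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    """Similarity scores of one candidate relative to the protein of interest."""
    name: str
    e_value: float   # sequence similarity (lower is more similar)
    z_score: float   # structural similarity (higher is more similar)

    def sequence_similar(self, cutoff: float = 0.05) -> bool:
        return self.e_value < cutoff

    def fold_similar(self, cutoff: float = 2.0) -> bool:
        return self.z_score > cutoff

    def is_good_comparative(self) -> bool:
        # Require both sequence and structural similarity.
        return self.sequence_similar() and self.fold_similar()


# Values for S/TPK as reported in this paper.
stpk = Comparison("S/TPK (1HOW)", e_value=5e-26, z_score=34.0)
print(stpk.name, stpk.is_good_comparative())
```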

CLK1 is a trimer consisting of three nearly identical monomers: subunits A, B, and C (1). Subunits A and B each have a complete protein sequence; however, residues 340-342 and 411-430 of subunit C are unknown, and subunit C has a slightly different structure from subunits A and B (1). Subunits A, B, and C are held together as a trimer by several interaction types: salt bridges, hydrogen bonds, and non-bonded contacts (7).

Subunits B and C exhibit all three types of interaction. They are linked by a salt bridge from Lys-283 (B) to Glu-229 (C), and hydrogen bonding occurs between Arg-469 of subunit B and Tyr-331 of subunit C. Thirty non-bonded contacts occur between four residues of subunit B and five residues of subunit C (7). Subunits A and B exhibit only five non-bonded contacts, with His-335 of subunit A contacting Lys-482 of subunit B and Lys-405 of subunit A contacting Asn-219 of subunit B (7). Subunits A and C have two hydrogen bonds and sixteen non-bonded contacts (7). Both hydrogen bonds lie between Thr-338 of subunit A and His-335 of subunit C (7). The sixteen non-bonded contacts occur between six residues of subunit A and four residues of subunit C (7). There are no disulfide bonds between the subunits (7).
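The interface counts above are easier to compare when tabulated. The Python sketch below records them per subunit pair and ranks the interfaces by number of non-bonded contacts. Two points are assumptions rather than values stated in (7): the B-C hydrogen bond count is taken as one because the text does not give a number, and interaction types not mentioned for a pair are entered as zero. The names `INTERFACES`, `total`, and `ranked_by_contacts` are hypothetical.

```python
# Interaction counts per subunit pair, transcribed from the paragraph above.
INTERFACES = {
    ("A", "B"): {"salt_bridges": 0, "hydrogen_bonds": 0, "non_bonded": 5},
    ("A", "C"): {"salt_bridges": 0, "hydrogen_bonds": 2, "non_bonded": 16},
    # The B-C hydrogen bond count is an assumption (no number is given).
    ("B", "C"): {"salt_bridges": 1, "hydrogen_bonds": 1, "non_bonded": 30},
}


def total(kind: str) -> int:
    """Sum one interaction type over all three interfaces."""
    return sum(counts[kind] for counts in INTERFACES.values())


def ranked_by_contacts() -> list:
    """Subunit pairs ordered from most to fewest non-bonded contacts."""
    return sorted(INTERFACES, key=lambda pair: INTERFACES[pair]["non_bonded"],
                  reverse=True)


for pair in ranked_by_contacts():
    print("-".join(pair), INTERFACES[pair])
print("total non-bonded contacts:", total("non_bonded"))
```

On these numbers the B-C interface is the most extensive and the A-B interface the least.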

CLK1 subunits A and B have the same secondary and tertiary structures. Subunits A and B each have five beta sheets, six beta hairpins, six beta bulges, fourteen random coils, seventeen alpha helices, twenty-two helix-helix interactions, twenty-seven beta turns, and four gamma turns (7). Subunit C has four beta sheets, five beta hairpins, six beta bulges, twelve random coils, thirteen alpha helices, twenty-two helix-helix interactions, twenty-four beta turns, and three gamma turns (7).
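The differences between subunit C and subunits A and B are easier to see as a count-by-count comparison. The Python sketch below stores the counts given above and reports only the elements that differ. The dictionary and function names are hypothetical.

```python
# Secondary-structure element counts transcribed from the paragraph above.
SUBUNIT_AB = {
    "beta_sheets": 5, "beta_hairpins": 6, "beta_bulges": 6, "random_coils": 14,
    "alpha_helices": 17, "helix_helix_interactions": 22, "beta_turns": 27,
    "gamma_turns": 4,
}
SUBUNIT_C = {
    "beta_sheets": 4, "beta_hairpins": 5, "beta_bulges": 6, "random_coils": 12,
    "alpha_helices": 13, "helix_helix_interactions": 22, "beta_turns": 24,
    "gamma_turns": 3,
}


def element_differences(reference: dict, other: dict) -> dict:
    """Return reference minus other for every element whose count differs."""
    return {
        name: reference[name] - other[name]
        for name in reference
        if reference[name] != other[name]
    }


for name, delta in element_differences(SUBUNIT_AB, SUBUNIT_C).items():
    print(f"{name}: subunit C has {delta} fewer")
```

Only the beta bulges and helix-helix interactions are unchanged. Every other element is less numerous in subunit C, with the largest gap in alpha helices.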

Each subunit has an ATP binding site and participates in phosphorylating SR-rich proteins. Key residues are Lys-191 and Asp-288, which act as the ATP binding site and a proton acceptor, respectively (1). The ATP binding socket is located at Lys-191 in each subunit and is occupied by the ligand pyrido[3,4-g]quinazoline (PQZ) in the X-ray crystal structure of CLK1 (1, 3). In subunits A and B, PQZ forms a hydrogen bond with the side-chain amine of Lys-191 and two hydrogen bonds with Leu-244, one with the residue’s C-terminal end and the other with its N-terminal end (7). Van der Waals forces from residues Leu-167, Val-175, Ala-189, Phe-241, Glu-242, Leu-243, Leu-295, and Val-324 also act on PQZ and provide additional binding within the ATP binding socket, or groove (7). In subunit C, the hydrogen-bonding residues are the same: Lys-191 and Leu-244 still form three hydrogen bonds in total with PQZ. However, the two residues sit closer together on one side of PQZ, so the ligand is not as “surrounded” by the binding residues (7). Van der Waals forces from Leu-167, Val-175, Ala-189, Phe-241, Glu-242, Leu-243, Leu-295, and Val-324 also bind PQZ; however, subunit C’s tertiary structure differs from that of subunits A and B, which leaves the binding groove exposed (7).

Besides PQZ, which occupies the ATP binding groove, glycerol is the other ligand present. The glycerol molecules are not functionally important; they are present because glycerol was the solvent used to prepare the CLK1-PQZ complex for crystallization (3). Glycerol is present on subunits A and C. On subunit A, glycerol forms hydrogen bonds with the hydroxyl groups of Ser-384 and Tyr-411 (7). On subunit C, glycerol forms hydrogen bonds with the hydroxyl group from Ala-183 and with one of the side-chain amines of Arg-186 (7). The glycerol bound to subunit A sits in the pore between subunits A and C, whereas the glycerol bound to subunit C is located between two beta sheets near the ATP binding groove (7).

For each CLK1 subunit, the N-terminal lobe consists of three beta strands followed by an alpha helix and two more beta strands (8). The C-terminal lobe has an alpha helix at its bottom that is made solvent-inaccessible by a large insertion spanning residues 400-432, as seen in Figure 1 (8). This region displays a helix, a loop, and a two-stranded beta sheet followed by an alpha helix, an arrangement that is unique to CLK structures (8). The top of the C-terminal lobe also has a unique insertion in which residues 300-317 form a beta hairpin (8).

The tertiary structure of S/TPK is similar to CLK1’s structure; however, S/TPK is a monomer and not a trimer. S/TPK resembles CLK1, which was described previously, and other protein kinases: the N-terminal lobe is made primarily of beta sheets and the C-terminal lobe is made of alpha helices, as seen in Figures 2 and 3. However, S/TPK has four non-kinase core segments (9). Residues 153 and 154 from the N-terminus form a short beta strand (beta 0) that extends the small lobe beta sheet to six antiparallel strands (9). The strand beta 0 is preceded by an eight-residue segment in an extended conformation that caps off the small lobe. The C-terminus of S/TPK also extends beyond the kinase core and wraps around the bottom of the small lobe before terminating near the activation loop (9). This C-terminal extension plays a role in maintaining the constitutive activity of S/TPK. In addition to the extensions at both ends of the kinase core, there are two inserts within the core. The first is an 11-amino acid insert (the alpha C' insert) between helix alpha C and beta 4. This segment includes a small loop that deviates from the small lobe, forms a seven-amino acid helix (alpha C') that interacts with helix alpha E in the large lobe, and then rejoins the small lobe. This insert appears to help stabilize the orientation of helix alpha C (9). The second insert is 47 amino acids long and lies between helix alpha G and helix alpha H in the large lobe (9). S/TPK is constitutively active like CLK1, since both proteins automatically bind and position ATP correctly for phosphorylation. Both proteins consist of two lobes: the N-terminal lobe with numerous beta sheets and the C-terminal lobe with more alpha helices than beta sheets. Both CLK1 and S/TPK have alpha helices wrapping around beta sheets, which form the hollow that accommodates ATP. The crucial differences in tertiary and secondary structure were explained above; however, substrate specificity separates their functions.
S/TPK specificity depends on the Glu-294 residue, as its mutation to serine would stop all phosphorylation (9). CLK1 responds to the presence of ATP through Glu-206 and Lys-191, which bind the substrate, and the acidic A loop is stabilized by polar contacts during phosphorylation (8). Additionally, the CLK C-terminal lobe contains a long insertion between the two strands β7 and β8. This insert forms the CLK-specific βhp-βhp′ hairpin that folds over a shallow groove created by helices alpha D and alpha E, as seen in Figure 1 (8).

CLK1 is a dual specificity protein kinase that is responsible for neuronal differentiation in Homo sapiens (8). Alternative RNA splicing is controlled by phosphorylation of serine- and arginine-rich splicing factors (8). CLK1 in particular targets serine- and arginine-rich substrates that have entered the nucleus, which are then phosphorylated at the serine-arginine-rich sites (8). The activated protein then moves on to take part in splicing and is dephosphorylated (8). Splicing regulation is responsible for a single gene's generation of numerous protein isoforms (8). CLK1 efficacy is believed to be related to hereditary neuronal diseases, such as Alzheimer's disease (AD) or Parkinson's disease (3). Deficiencies in CLK1 dual specificity binding to specific serine-arginine-rich protein binding sites in neurons, caused by hereditary AD mutations, could lead to the formation of tau microtubule tangles, which induce neurodegeneration in AD patients (2, 3).